Recognizing Dysarthric Speech due to Amyotrophic Lateral Sclerosis with Across-Speaker Articulatory Normalization

Authors

  • Seongjun Hahm
  • Daragh Heitzman
  • Jun Wang
Abstract

Recent studies of dysarthric speech recognition that used mixed data from a collection of neurological diseases suggest that articulatory data can improve recognition performance. This project was designed specifically for speaker-independent recognition of dysarthric speech due to amyotrophic lateral sclerosis (ALS) using articulatory data. In this paper, we investigated three across-speaker normalization approaches in the acoustic space, the articulatory space, and both: Procrustes matching (a physiological approach in articulatory space), vocal tract length normalization (a data-driven approach in acoustic space), and feature space maximum likelihood linear regression (a model-based approach for both spaces), to address the high degree of variation in articulation across speakers. A preliminary ALS data set was collected and used to evaluate the approaches. Two recognizers were used: Gaussian mixture model (GMM)-hidden Markov model (HMM) and deep neural network (DNN)-HMM. Experimental results showed that adding articulatory data significantly reduced the phoneme error rates (PERs) with any single normalization approach or any combination of them. DNN-HMM outperformed GMM-HMM in all configurations. The best performance (30.7% PER) was obtained by triphone DNN-HMM + acoustic and articulatory data + all three normalization approaches, a 15.3% absolute PER reduction from the baseline using triphone GMM-HMM + acoustic data.
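The abstract names Procrustes matching but does not describe its steps. As a rough illustration only, the sketch below shows a generic Procrustes alignment: translate a shape to its centroid, scale it to unit size, and rotate it onto a reference. It assumes each articulatory "shape" is a list of 2D points (for example, sensor positions in one plane); the function name `procrustes_normalize`, the 2D restriction, and the RMS size measure are assumptions made for this example, not details taken from the paper.

```python
import math


def procrustes_normalize(shape, reference=None):
    """Center a 2D shape, scale it to unit RMS size, and optionally
    rotate it onto an already-normalized reference shape.

    `shape` and `reference` are lists of (x, y) tuples with matching order.
    This is a generic sketch, not the procedure used in the paper.
    """
    n = len(shape)
    # Translation: move the centroid to the origin.
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    centered = [(x - cx, y - cy) for x, y in shape]

    # Scaling: divide by the root-mean-square distance from the centroid.
    size = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if size == 0.0:
        raise ValueError("shape has zero size and cannot be scaled")
    scaled = [(x / size, y / size) for x, y in centered]

    if reference is None:
        return scaled

    # Rotation: closed-form least-squares angle for paired 2D points.
    num = sum(x * v - y * u for (x, y), (u, v) in zip(scaled, reference))
    den = sum(x * u + y * v for (x, y), (u, v) in zip(scaled, reference))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in scaled]


# Hypothetical usage: a second "speaker" whose shape is a rotated, scaled,
# and shifted copy of the first should align with the reference.
ref_raw = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
reference = procrustes_normalize(ref_raw)
angle = math.radians(30)
other = [
    (1.7 * (math.cos(angle) * x - math.sin(angle) * y) + 5.0,
     1.7 * (math.sin(angle) * x + math.cos(angle) * y) - 2.0)
    for x, y in ref_raw
]
aligned = procrustes_normalize(other, reference)
```

In this toy case the differences in position, size, and orientation between the two shapes are removed, which is the kind of across-speaker variation the normalization is meant to reduce.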

Similar resources

Toward Phonetic Intelligibility Testing in Dysarthria the Concept of Speaker Intelligibility

The measurement of intelligibility in dysarthric individuals is a major concern in clinical assessment and management and in research on dysarthria. The measurement objective is complicated by the fact that intelligibility is not an absolute quantity but rather a relative quantity that depends on variables such as test material, personnel, training, test procedures, and state of the speaker. Th...

The TORGO database of acoustic and articulatory speech from speakers with dysarthria

This paper describes the acquisition of a new database of dysarthric speech in terms of aligned acoustics and articulatory data. This database currently includes data from seven individuals with speech impediments caused by cerebral palsy or amyotrophic lateral sclerosis and age- and gender-matched control subjects. Each of the individuals with speech impediments is given standardized assessment...

Dysarthric Speech Recognition and Offline Handwriting Recognition using Deep Neural Networks

Suhas Pillai, M.S., Rochester Institute of Technology, 2017. Supervisor: Dr. Raymond Ptucha. Millions of people around the world are diagnosed with neurological disorders like Parkinson's, Cerebral Palsy or Amyotrophic Lateral Sclerosis. Due to the neurological damage as the disease progresses, the person s...

Spontaneous speech production by dysarthric and healthy speakers: Temporal organisation and speaking rate

This study compares speaking rate in spontaneous speech between dysarthric and healthy speakers. Since dysarthria involves heterogeneous pathologies, two types of dysarthria (i.e. Parkinson’s disease and amyotrophic lateral sclerosis) have been distinguished. We hypothesize that temporal organisation of speech may be different between healthy and dysarthric speakers, but also between both patho...

Across-speaker articulatory normalization for speaker-independent silent speech recognition

Silent speech interfaces (SSIs), which recognize speech from articulatory information (i.e., without using audio information), have the potential to enable persons with laryngectomy or a neurological disease to produce synthesized speech with a natural sounding voice using their tongue and lips. Current approaches to SSIs have largely relied on speaker-dependent recognition models to minimize t...

Publication date: 2015